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ABSTRACT 

Robust  Statistics  provides  a fresh  approach  to  the  difficult 
problem  of  editing  in  data  reduction.  Of  prime  concern  are  grossly 
erroneous  measurements  which,  when  undetected,  completely  destroy 
automated  data  reduction  procedures  causing  costly  reruns  and 
time  delays  with  human  detection  of  the  erroneous  measurements. 

The  application  of  robust  statistical  methods  has  been  highly 
successful  in  dealing  with  this  problem.  An  introduction  to 
the  robust  M-estimates  and  their  numerical  computation  is  given. 

The  application  of  M-estimates  to  data  preprocessing,  instrument 
calibration,  N-station  cinetheodolites , N-station  radar  solution, 
and  filtering  are  described  in  detail.  Numerical  examples  of 
these  applications  using  real  measurements  are  given. 
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INTRODUCTION 

Robust  statistics  provides  a new  approach  to  data  editing  in  trajectory 
data  reduction.  Data  editing,  whose  function  is  to  deal  with  wild  obser- 
vations, has  been  a most  frustrating  problem  for  the  data  analyst.  The 
use  of  robust  statistics  has  been  highly  successful,  much  more  so  than 
previous  methods,  in  dealing  with  this  problem.  There  are  several  appli- 
cations of  robust  statistics  to  data  editing  in  trajectory  data  reduction. 

The  applications  considered  here  are: 

Data  Preprocessing 
Instrument  calibration 
N-station  Cine  solution 
N-station  radar  solution 
Filtering 

Before  describing  these  applications  we  need  to  answer:  What  are  robust 
statistics  and  how  do  robust  statistics  apply  to  data  editing?  In  answer 
to  the  first  part  of  the  question  robust  statistical  methods  are  those 
which  tend  to  retain  their  desirable  properties  under  at  least  mild 
violations  of  the  assumptions  under  which  they  were  derived.  Possibly 
a more  useful  description  of  a robust  statistical  procedure  is  one  which 
will  perform  well  under  a variety  of  underlying  distribution  functions 
or  in  the  presence  of  observations  from  contaminating  distributions. 

Thus,  the  sample  median  is  a more  robust  procedure  than  the  sample  mean 
for  estimating  the  mean  of  a symmetric  parent  distribution,  if  even  a 
moderate  amount  of  contamination  from  a long  tailed  distribution  is  present. 

In  answer  to  the  second  part  of  the  question  we  are  probably  not  very 
concerned  about  the  performance  of  data  reduction  procedures  under  a wide 
variety  of  underlying  distribution  functions  of  the  observations  but  are 
mainly  concerned  about  the  performance  of  our  methods  in  the  presence 
of  observations  from  contaminating  distributions,  i.e.,  outliers.  Thus, 
in  data  reduction  we  are  interested  in  the  development  of  robust  statistical 
methods  which  are  highly  outlier  resistant.  In  data  reduction  we  are 
usually  interested  in  estimating  the  parameters  in  some  postulated  linear 
or  nonlinear  model  of  a process.  Thus,  in  data  reduction  we  are  specifically 
interested  in  developing  methods  for  linear  and  nonlinear  regression 
which  are  insensitive  to  a large  percentage  of  outlying  observations. 

Many  sources  of  outliers  are  present  in  trajectory  measuring  systems. 

Without  going  into  any  detail,  these  sources  may  be  broadly  grouped  into 

the  categories  of  equipment  malfunction,  outside  interference,  and  human  error. 

The  usual  methods  of  least  squares,  optimally  weighted  least  squares, 
maximum  likelihood,  etc.,  used  in  data  reduction  for  estimating  parameters 
in  a regression  model  are  rendered  useless  by  the  presence  of  outliers. 

To  quote  Huber  [l],  "even  a single  grossly  outlying  observation  may  spoil 
the  least  squares  estimate  and  moreover  outliers  are  much  harder  to  spot 
in  the  regression  case  than  in  the  simple  location  case." 


Although  the  history  of  robust  estimation  goes  back  to  the  19th  century, 
the  development  of  robust  regression  methods  is  just  currently  becoming  a 
popular  topic  for  statistical  research.  Some  of  the  earliest  methods  for 
robust  regression  were  developed  in  the  1950's,  notably  the  methods  reported 
by  Brown  and  Mood  [9]  and  by  Theil  [5].  Robust  estimation  methods  have  been 
classified  by  Huber  [l]  and  [2],  Huber's  classifications  are  termed  L-estiraates, 
M-estimates,  and  R-estimates.  The  L-estimates  are  estimates  which  are  linear 
combinations  of  the  order  statistics.  The  a-trimmed  mean  is  an  example  of  an 
L-estimate  for  a simple  location  parameter.  The  R-estimates  are  estimates 
derived  on  the  basis  or  rank  tests.  The  estimate  of  location  obtained  by 
taking  the  median  of  all  pairwise  averages  of  the  observations  is  an  R-estimate. 
Of  the  three  classifications  for  robust  estimates  given  by  Huber  we  shall  only 
be  concerned  with  M-estimates  in  this  report.  The  reason  for  this  is  not  that 
M-estimates  are  superior  but  because  we  are  only  interested  in  describing  the 
applications  of  robust  regression  to  data  reduction  and  this  seemed  easiest 
to  do  with  the  M-estimates. 

Given  the  linear  model 
P 

y = E x 0 + e i=l,  N (1) 

1 j=l  3 3 

the  regression  parameters  0^  are  to  be  estimated.  The  M-estimates  of 
0^  minimize 

N 

1 p(yi  ■ z xijV  (2) 

i=l  j J 3 

where  p (•)  is  some  function  which  is  often  convex.  Differentiating  with 
respect  to  0 leads  to 

N 

E xT*(y  - l x 0 ) = 0 (3) 

i=l  1 1 j ij  3 


*1  = col(xJL1,  xi2, xip) 


where 


and  i|»  (•)  is  the  derivative  of  p (•).  (3)  is  the  analog  of  the  normal  equations 

in  least  squares  regression.  The  estimate  which  results  from  solving  (3) 
is  called  an  M-estimate.  Rather  than  specifying  the  function  p,  M-estimates 
are  usually  described  by  specifying  the  function  <| 1>.  If  f(y;0)  is  the 
probability  density  function  underlying  the  observations,  and  if 


then  the  M-estimate  obtained  is  the  maximum  likelihood  estimate.  Since 
the  function  p is  usually  not  homogeneous,  as  it  would  be  in  least  squares, 
the  M-estimates  obtained  would  usually  not  be  scale  invariant.  Hence, 
to  force  scale  invariance  we  minimize 


N 

l P 
i=l 


yi ' ? Vj 


where  s is  some  measure  of  dispersion  of  the  residuals, 

The  measure  s also  needs  to  be  a robust  measure  of  dispersion. 


Several  iJj  functions  have  been  proposed  in  the  literature.  Basically 
those  4>  functions  fall  into  two  classes,  the  redescending  type  and  non- 
redescending  type.  We  will  only  consider  one  member  of  each  of  these 
classes  in  this  report.  The  original  ^ function  proposed  by  Huber  is  of 
the  non-redescending  type.  This  function  is 


<Kx) 


x |x|<a 

a sgn(x)  |x|>a 


(5) 


An  example  of  a f function  of  the  redescending  type  is  the  function  proposed 
by  Hampel  [6], 


a sgn(x) 


Other  \p  functions  have  been  proposed  by  Andrews  [8],  Tukey  [3],  and 
Ramsay  [4],  There  are  also  a number  of  other  methods  for  robust  regression 
see  [ll]. 


An  attractive  feature  of  least  squares  regression  is  the  ease  of 
numerical  solution.  One  might  be  inclined  to  think  that  the  numerical 
solution  for  an  M-estimate  would  in  many  cases  be  prohibitive.  At  worst 
(4)  can  be  minimized  by  one  of  the  many  algorithms  for  minimization,  e.g 
the  Fletcher-Powell  method  [lO], 


An  iterative  solution  such  as  a Gauss-Newton  method  can  easily  be 
applied  to  minimize  (4).  Setting  the  derivative  of  (4)  to  zero  and 

*{k} 

linearizing  about  an  arbitrary  point  9 in  the  iteration  sequence  gives 


yields 


Solving  (7)  for  0 


where 


(9) 


The  iterative  application  of  (8)  and  (9)  results  in  a fairly  simple  method 
for  obtaining  an  M-estimate.  An  approximate  sample  covariance  for  this 
estimate  is  given  by 


. N 

cov (0)  = v = i r 
n-p  i=l 


M_1  (a  x^)  h"1 


An  even  simplier  numerical  method  and  one  which  has  achieved  considerable 
popularity  for  obtaining  M-estimates  is  the  iterative  application  of  weighted 
least  squares.  Setting  the  derivative  of  (4)  with  respect  to  0 equal  to 


Then  (12)  is 


(14)  can  be  solved  iteratively  as  follows.  Starting  at  an  arbitrary 

a(k} 

point  0 in  the  sequence  of  iterations,  ' replace  (14)  by 


* { k } T 

l W .(0lK')X  (y 
i=l 

Solving  (15)  for  0^*  + ^ 


+ 1})=  Q 


(15) 


(16) 


Thus,  we  can  use  an  ordinary  weighted  least  squares  algorithm  iteratively 
to  obtain  the  M-estimate, 

Throughout  the  discussion  of  M-estimates  we  have  used  the  dispersion 

measure  s of  the  residuals  y^  - X^0  without  consideration  for  its  computation. 

Robust  dispersion  measures  are  often  taken  to  be  a multiple  of  the  inter- 
quartile range  or  of  some  other  range  statistic  of  a set  of  residuals.  A 
dispersion  measure  which  seems  to  be  most  popular  with  those  using  M-estiinates 
is  the  median  deviation  or  the  MAD  (Median  of  the  Absolute  Deviations) 
estimate  as  it  is  sometimes  called.  The  MAD  estimate  used  above  is  defined  by 


S'*  + « 


1N  A { K } T 
l W (0lK'r)XTX . 

j=i  j J J 


w.(0{K})x^yi 


s = med|r  | / .6745  (17) 

i 1 ' 

where  r^  = y^  - X^0.  Hampel  [6]  has  shown  that  the  MAD  estimate  is 
the  most  robust  estimate  of  dispersion. 

Both  the  Gauss-Newton  method  and  the  weighted  least  squares  method  for 
obtaining  M-estimates  are  iterative  and  therefore  require  a starting 
solution.  The  required  closeness  of  the  starting  solution  to  the  final 
solution  is  dependent  on  the  application  and  the  type  of  function  used. 
Quite  often  an  ordinary  unweighted  least  squares  solution  is  a sufficiently 
good  starting  solution.  In  some  cases  it  is  necessary  to  use  a starting 
solution  which  is  more  robust,  see  [ 12 ] . 


APPLICATION  TO  DATA  PREPROCESSING 

It  is  this  application  which  provided  our  original  motivation  for 
the  development  and  application  of  robust  statistical  methods  in  data 
reduction.  There  are  several  possible  functions  of  data  preprocessing. 
Ambiguities  in  phase  measurements  might  be  resolved  by  preprocessing.  It 
might  be  used  merely  to  detect  outliers  in  the  measurement  data  Decause 
their  detection  in  the  main  processor  might  be  considerably  more  difficult. 
Also,  the  main  processor  often  requires  the  use  of  weights  for  each  of  the 
measurements  or  the  main  processor  might  require  that  a set  of  meas- 
surements  be  synchronized  before  processing.  These  requirements  can  be 
fulfilled  by  data  preprocessing. 

Given  the  time  history  of  a particular  measurement  function  for  its 
entire  span  of  observation  on  a trajectory,  the  preprocessing 
function  divides  the  interval  of  observation  into  equal  segments  of  T 
seconds  except  for  a final  segment  either  shorter  or  longer  that  T.  Over 
each  of  these  segments  a polynomial,  usually  a quadratic,  is  fit  to  the 
measurements.  Alternatively,  a cubic  spline  might  be  fit  to  the  entire 
span  of  measurement  data  using  the  end  points  of  the  T second  segments  as 
knot  times.  Thus,  in  measurement  preprocessing  we  might  model  the  ath 
measurement  over  an  arbitrary  interval  of  the  trajectory  as 

ya(ti)  = °0  + 0l(ti  " *>  + 02(t±  - t)2  i=l,  N (18) 

where 

_ N 

t = 1 E t. 

N i=l  1 

Using  some  robust  M-estimate  of  the  parameter  vector  0 = [0^  0^,  02 ] 
we  would  minimize 


Q = ? pfW  - 0O  - 9 

i=i  V 


1(ti  - t)  ©2(ti  - 


!>!) 


which  upon  differentiating  gives  the  analog  of  the  normal  equations 


<1 


t 


We  solve  (20)  iteratively  to  obtain  robust  estimates  0q,  0^,  0 ^ 

In  the  iterative  solution  of  (20)  s is  taken  to  be  the  median  of  absolute 
residuals 


where 


s = med  I r (t . ) 
i “ 1 


6745 


Tv  2 


r«(ti>  = W - °0  " °l(ti  - ‘ 02(ti  - fc) 


The  following  data  set  is  from  a real  data  reduction  situation.  The 
measurements  are  a sequence  of  azimuth  measurements  from  a cine. 


OBSERVATIONS 

RESIDUALS 

FROM  ROBUST  FIT 

RESIDUALS  FROM 
LEAST  SQUARES  FIT 

-1.70987 

.000012 

-.157774 

-1.70942 

-.000004 

-.000204 

-1.70893 

.000003 

.105480 

-1.70845 

-.000015 

.159227 

-1.70793 

-.000010 

.161087 

-1.70741 

-.000021 

.111021 

-1.70682 

.000022 

.009099 

-1.70626 

.000019 

-.144780 

-1.70571 

-.000010 

-.350595 

-1.70510 

.000005 

-.608277 

-1.70449 

.000004 

-.917885 

1.43777 

3.141637 

1.862231 

1.44602 

3.149243 

1.456410 

-1.70257 

-.000007 

-2.158177 

1.44667 


3.146558 


473139 


There  are  three  obvious  outliers  in  the  data.  The  residuals  from  an 
ordinary  least  square  fit,  which  are  given  in  the  last  column  yield 
no  information  about  outliers  in  the  data.  The  residuals  from  the 
robust  fit  which  were  obtained  using  a Hampel  ^ function  (breakpoints  2.5, 
5.0,  7.5)  show  exactly  which  observations  were  outliers.  Outliers  can 
be  detected  as  those  residuals  r^  for  which  r^ks.  The  dispersion 

s may  be  saved  for  use  in  making  weights  for  the  observations  in  the  main 
processing.  Another  example  of  data  preprocessing  is  provided  by  the  40 
point  data  sequence  below. 


i 


* 


LEAST 

NORMALIZED 

SQUARES 

ROBUST 

ROBUST 

RESIDUALS 

RESIDUALS 

OBSERVATION 

RESIDUALS 

1 

-.011022 

-.000278 

.20642275 

1.005559 

2 

-.009071 

-.000006 

.20973521 

.020803 

3 

-.007471 

-.000033 

.21296912 

.120171 

4 

-.005711 

.000151 

.21663652 

.546808 

5 

-.004461 

-.000123 

.22006619 

.445501 

6 

-.002730 

.000136 

.22425138 

.492246 

7 

-.001590 

-.000144 

.22811853 

.519552 

8 

-.000213 

-.000135 

.23249603 

.487267 

9 

.001201 

-.000038 

.23718297 

.136926 

10 

.002489 

-.000014 

.24201791 

.051970 

11 

.003798 

.000082 

.24714760 

.297949 

12 

.005624 

.000748 

.25306741 

2.703007 

13 

.005421 

-.000564 

.25723122 

2.037977 

14 

.008660 

.001617 

.26510980 

5.845255 

15 

.006010 

-.002037 

.26737381 

7.361710 

16 

.009663 

.000662 

.27621340 

2.394020 

17 

.011016 

.001113 

.28302583 

4.023731 

18 

.010359 

-.000392 

.28810282 

1.418292 

19 

.011568 

.000019 

.29531815 

.067036 

20 

.012005 

-.000291 

.30203451 

1.051051 

21 

.012861 

-.000129 

.30944403 

.464413 

22 

.013557 

-.000075 

.31696650 

.269818 

23 

.014001 

-.000222 

.32450959 

.800901 

24 

.014501 

-.000260 

.33238295 

.938668 

25 

.015039 

-.000209 

.34056693 

.754131 

26 

.015433 

-.000249 

.34888132 

.898506 

27 

.015913 

-.000152 

.35755414 

.547835 

28 

.016283 

-.000113 

.36639033 

.406971 

29 

.016494 

-.000181 

.37534057 

.654202 

30 

-.058265 

-.075167 

.30959446 

271.639732 

31 

-.172487 

-.189565 

.20465789 

685.052254 

32 

.018472 

.001270 

.40517605 

4.589248 

33 

-.064416 

-.081690 

.33212063 

295.211357 

34 

.089274 

.071980 

.49591643 

260.122231 

35 

-.251831 

-.269092 

.16519139 

972.446930 

36 

.007852 

-.009326 

.43552655 

33.701152 

37 

.159606 

.142564 

.59820610 

515.197899 

38 

.059168 

.042313 

.50896735 

152.912771 

39 

.016704 

.000088 

.47797510 

.318960 

40 

.016296 

-.000028 

.48931307 

.101770 

1 


The  solution  for  the  M-estimate  used  a least  square  starting  solution  and 
a Hampel  ip  function  with  breakpoints  at  2.5,  5,  and  7.5.  In  the  list  of 
least  square  residuals  given  above  some  of  the  outliers  are  obvious  while 
others  are  not.  The  column  of  normalized  residuals  is  merely  the  robust 
residual  divided  by  the  robust  dispersion  measure  s.  If  we  declare  that 
residuals  greater  than  2.5s  are  outliers  then  we  would  flag  observations 
12,  14,  15,  17,  30,  31,  32,  33,  34,  35,  36,  37,  and  38  as  outliers.  Some 
of  these  outliers  are  much  more  gross  than  others.  The  M-estimate  of  the 

AAA 

parameter  vector  is  0^  = .20388,  0^  = .05419,  0£  = .04427.  This 

example  is  simulated  data  so  that  the  true  parameter  vector  is  known  to 
be  0n  = .20397,  0 = .0537,  0 = .0445.  The  least  squares  starting 

U | l 1 Z r y r ■» 

solution  was  0"°  = .21636,  010  ^ = .01901,  010  ^ = .05466. 

INSTRUMENT  CALIBRATION 


Surveyed  targets  are  used  for  calibrating,  i.e.,  estimating  the 

coefficients  in  an  error  model,  for  radars,  cinetheodolites  or  laser  trackers. 

Suppose  for  example  we  have  M surveyed  targets  for  a laser  tracker.  Let 

R , A . , E . be  the  surveyed  range,  azimuth,  and  elevation  for  the  jth  target, 
sj  SJ  SJ 

Suppose  that  multiple  observations  of  the  targets  are  available  so  that  we 

nuth 

Let 


have  Nj  observations  for  the  jth  target.  Denote  these  range,  azimuth  and 


elevation  observations  by  R..,  AJ , and  E^ . , i 

ij  i j ij 


1,  N » j - 1,  M. 


AR.  . 

= R . 

- R , 

T 

= r .0  + 

I 

c 1. 

ij 

sj 

J 

aa 

= A . . 

- A , 

T 

ij 

ij 

sj 

= a^0  + 

AE . . 

= E,  • 

“ E , 

T 

= e .0  + 

ij 

ij 

sj 

j 

where  0 is  an  unknown  parameter  vector  and  r , a ^ , and  e^  are  known 
vectors.  A common  model  for  r^ , a^  and  e is  given  by 
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(21) 

(22) 


sj 


(23) 


gives  a robust  estimate  0.  Since  the  elements  of  the  parameter  vector 
are  usually  small,  the  elements  of  the  starting  solution  0^°^  may  be  set 
to  zero  except  for  0^°^,  03°^*  an<*  ©y^which  can  be  set  to  the  medians 
of  , and  E^  , respectively. 


The  following  example  illustrates  the  application  of  M-estimates  to 
the  calibration  of  a laser  tracker  using  real  field  data.  The  laser  tracker 
is  calibrated  by  using  azimuth  and  elevation  observations  from  eight  reflective 
targets  arranged  in  a circular  pattern  around  the  tracker  at  a range  of  about 
2500  feet.  We  use  the  error  model  given  in  (21)  - (23).  Since  the  elevations 
of  the  eight  calibration  targets  are  approximately  equal,  it  is  obviously 
impossible  to  estimate  0 in  (22)  without  additional  observations. 

D 

In  order  to'  provide  these  additional  observations  we  observe  the  same 
calibration  targets  but  with  the  tracker  "dumped",  i.e.,  with  an  azimuth 
of  approximately  A + 180  and  an  elevation  of  approximately  Eg^ 

- 180  . These  additional  observations  are  called  dumped  readings  and 
are  treated  as  additional  calibration  targets.  Also,  we  can  see  from 
(21)  that  we  will  be  unable  to  estimate  using  the  eight  calibration 

targets  since  the  ranges  to  all  targets  are  approximately  equal.  In  order 
to  estimate  0^  we  observe  four  additional  calibration  targets  with 

ranges  varying  from  20000  feet  to  60000  feet.  In  this  example  dumped 
readings  were  not  available  so  that  0^  could  not  be  estimated.  Also, 

data  from  two  of  the  close  targets  are  missing.  Approximately  250 
observations  are  available  for  each  of  the  remaining  target  boards. 


A Hampel  t p function  which  was  defined  in  (6)  was  used  for  this  example. 
The  parameters  or  break  points  of  the  Hampel  iji  in  this  example  are  a = 2.5, 
b = 5.0,  c = 7.5.  The  results  of  this  robust  calibration  are  summarized 
in  the  figure  below  by  tabulating  the  number  of  residuals  for  each  target 
lying  in  each  region  of  the  Hampel  t/>.  We  show  only  the  positive  side  of 
the  ij>  function  with  the  number  of  residuals  in  each  region  being  the  sum 
of  the  numbers  of  residuals  in  the  positive  and  corresponding  negative  side 
of  the  1 1>  function. 
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Thus,  we  can  see  that  a significant  percentage  of  the  range  observations 
from  the  long  range  targets  are  bad  including  all  range  observations  from 
target  number  12.  The  parameter  estimates  for  this  example  are  0^  * 

1.69  feet,  0.  = .508145  x 10~4,  0.  = .11405  mr  0.  = .511565  mr, 

2 3 4 

0,.  = -.105925  mr,  0^  = -.04014  mr.  The  least  squares  calibration  for 
this  example  gives  the  erroneous  values  for  the  range  calibration  0^  = 
-395.27  feet  and  ©2  = -.96948  x 10~2. 


N-STATION  CINETHEODOLITE  SOLUTION 

The  N-station  Cine  solution  is  a standard  problem  in  data  reduction. 

In  this  situation  we  are  given  azimuth  observations  a^Ct^)  and  elevation 

observations  e^t^),  a = N^,  from  cines  at  each  time  point  t^  along  a 

trajectory.  From  these  cines  we  must  estimate  the  cartesian  positions 

xCt^),  yCtj),  zCt^),  at  each  time  point.  The  observations  are  aa(t^)  = 

A (xj  + error  and  e (tj  = E (x . ) + error.  The  measurement  functions 
a _i  _ a i a i _ 

A (x.)  and  E (x.)  are  functions  of  the  position  vector  x.  = [x(t.)  y(t^)  z(t 
o 1 a i i i i 

These  measurement  functions  are  given  by 


W 


tan-1  x(tl)  ' xq 


(30) 


Eo(x)  - tan~^ Z(tl)  ' za (31) 

[(xCt,)  - xa>2  + (y(ti)  - ya)2ll/2 

where  (x^,  yQ,  z^)  is  the  cartesian  position  of  the  a1"*1  cine.  The  usual 

least  square  problem  to  estimate  the  position  xCt^),  yCt^),  z(t^)  is 

nonlinear.  Thus,  the  robust  estimation  of  these  quantities  will  be  nonlinear 
both  because  the  objective  function  for  the  robust  estimation  problem  is 
non-quadratic  and  because  the  measurement  model  is  a nonlinear  function  of 
the  parameters  to  be  estimated.  The  usual  least  squares  solution  would 
minimize 


An  M-estimate  of  the  position  vector  x^  would  minimize 


l P 
a=l 


Differentiating  (33)  gives 
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where 


ra(a)  = aa(ti)  ' Aa(xi) 


re(a)  = ea(t±)  - Eo(x±) 


(34)  can  be  rewritten  as 
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CT(x  ) = 1 9Aa  1 3Ea(xi^ 

“ Sa  9xi  Se  a*± 


a 3 x 2 matrix  and 


s s 

a e 


J ra^a^  | cos^ea(t^) 
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An  iterative  solution  of  (35)  with  s ^ = med|ra(a)  |y  .6745,  sg  = med | r£(a)  |^.6745 
gives  a robust  estimate  of  the  parameter  vector  x^. 

As  an  example  or  robust  estimation  applied  to  a cinetheodolite  solution 
consider  the  following  situation  which  is  rather  extreme  but  sometimes  occurs. 

A missile  is  fired  at  a drone  and  cinetheodolites  are  observing  both  the 
missile  and  drone.  It  is  required  to  provide  a cine  derived  trajectory 
on  both  the  missile  and  the  drone.  Due  to  an  inadvertent  clerical  error 
one  of  the  cines  which  was  actually  observing  the  missile  was  erroneously 
listed  as  observing  the  drone.  Obviously,  when  doing  a least  squares 
solution  to  obtain  the  drone  trajectory,  the  azimuth  and  elevations  from 
one  cine  will  be  gross  outliers  and  may  destroy  the  least  squares  solution 
for  the  drone  position  coordinates.  A single  point  example  of  this  situation 
is  furnished  by  the  actual  cine  data  given  below 


ine 

Obs.  Azimuth 

Obs.  Elevation 

1 

.568106 

.338886 

2 

-.626010 

.122620 

3 

-2.665036 

.359168 

4 

1.926249 

.327177 

Cine  2 is  the  one  which  is  actually  tracking  the  missile  rather  than  the 
drone.  Obviously,  as  in  most  situations  which  are  the  nonlinear,  there  is 
no  way  of  distinguishing  the  outliers  by  inspecting  the  observations.  As 
always  in  robust  estimation  a preliminary  solution  is  required  to  start  the 

iteration.  Let  (x^,  y , z a)  be  a position  solution  obtained  from  the 

pair  of  cines.  In  this  example  we  have  six  possible  pairs  of  cines  so  that 

a = 1,  6.  We  then  start  the  iteration  with  (x°,  y°,  z°)  where  x°  = med  x^, 

a=l,  6 

y = med  y^,  z = med  z^.  For  the  example,  the  median  guess  solution 
a=l,  6 a=l,  6 

is  x°  = -45147.9  ft.,  y°  = 87423.8  ft.,  z°  = 11117.3  ft.  After  five  iterations 
the  sequence  has  converged  to  the  solution  x = 32964.8  ft.,  y = 87425.2  ft., 
z = 11114.9  ft.  The  residuals  from  the  final  solution  are 


Residuals 


( ? 


Cine 


1 

2 

3 

4 


Thus,  the  robust  solution  using  the  Hampel  \p  with  breakpoints  of  3,  6,  9, 
correctly  identified  the  outliers.  Let  us  carry  this  example  farther. 

Suppose  we  have  no  observations  from  Cine  1,  i.e.,  we  have  data  from  only 
three  cines  one  of  which  is  bad.  In  this  case  our  starting  solution  turns 

out  to  be  x°  = 45147.9,  y°  = 87424.1,  z°  = 11120.2.  After  four 
iterations  the  solution  has  converged  to  x = -32966,  y = 87424.6,  z = 11115.3. 
Thus,  we  are  again  able  to  correctly  identify  the  bad  cine.  Now  suppose  we 
have  data  from  cines  1,  2,  3.  In  this  case  the  initial  guess  solution  is 

x°  = 45147.9,  y°  = 67033.9,  z°  = 11118.9.  After  ten  iterations  the 
solution  is  x = -35023.9,  y = 84462.1,  z = 11004.1.  The  solution  eventually 
converges  to  the  correct  value,  but  slowly.  A third  possibility  to  have 
data  from  only  three  cines  is  obervations  from  cines  1,  2,  4.  In  this  case 

the  guess  solution  is  x°  = -46454.3,  y°  = 87548.3,  z°  = 7262.7.  After 
three  iterations  the  solution  has  converged  to  x = -35392.6,  y = 86464.3, 
z * 1044.8.  Thus,  in  this  case  the  iteration  has  converged  to  the  wrong 
solution.  In  the  last  two  cases  where  the  solution  converged  very  slowly  and 
converged  to  the  wrong  solution,  the  starting  solution  was  too  far  from  the 
correct  solution.  If  a sufficiently  good  start  had  been  provided  the  solution 
would  have  converged  correctly  in  a few  iterations.  If  the  number  of  cines 
were  great  enough  in  comparison  to  the  number  of  bad  cines,  using  the  median 
of  the  solutions  obtained  from  the  cine  pairs  provides  an  acceptable  starting 
solution.  Unfortunately,  the  number  of  cines  is  often  no  more  than  three 
or  four.  In  the  case  of  three  cines  the  use  of  a starting  solution  predicted 
from  preceding  points  might  be  a desirable  procedure.  If  preprocessing  had 
been  used  on  the  cine  data  most  if  not  all  of  the  outliers  of  the  spike 
variety  in  the  cine  data  would  have  been  detected  before  attempting  a solution. 
Thus,  robust  estimation  in  the  solution  has  only  to  contend  with  detecting 
badly  biased  cines.  In  any  situation  with  three  ir  more  cines  with  one  bad 
cine,  the  robust  solution  will  usually  provide  a better  solution  than  the 
usual  least  square  procedure.  A strategy  for  choosing  a good  starting  solution 
needs  to  be  developed.  A robust  N-station  radar  solution  is  developed  along 
the  same  lines  as  a robust  cine  solution.  In  the  radar  case  a starting 
solution  for  the  iteration  is  somewhat  easier  to  obtain. 


Ll — A 


Azimuth 

Elevation 

.000008 

-.000064 

-.242553 

.011513 

.000022 

.000081 

.000057 

-.000019 

Application  to  Recursive  Filtering 


Very  little  development  has  been  done  on  the  application  of  robust 
statistical  techniques  to  filtering.  The  most  significant  effort  known 
to  the  author  is  given  in  the  paper  of  Masreliez  and  Martin  [7],  Their 
development  of  robustifying  the  Kalman  filter  is  quite  complex  and  will  not 
be  considered  here.  It  is  a simple  matter  to  specify  a form  for  an 
approximate  M-filter  and  its  covariance. 

Suppose  we  wish  to  estimate  the  state  x(n)  of  the  linear  dynamic 
model  described  by  the  state  equation 

x(n  + 1)  = 4>(n  + 1,  n)x(n)  + u(n)  (37) 

where  4>(n  + 1,  n)  is  an  mxm  state  transition  matrix  and  u(n)  is  an  m-vector 
of  state  noise  with  covariance  Qn.  Suppose  we  are  also  given  scalar 
observations  Z(n)  of  the  state  specified  by 

Z(n)  = Hx(n)  + v(n)  (38) 

where  H is  a lxm  matrix  of  constants  and  v(n)  is  observation  error.  By 
analogy  with  the  least  squares  filter  derivation  we  minimize 


E p Z(i)  - Hx(i) 


Subject  to  the  constraints 

x(i  + 1)  - $x(i)  - u^  = 0 i = 1,  n - 1 

Minimizing  (39)  leads  to  the  approximate  filter  equations 


+ 1 


u(i)Qi1u(i) 


(39) 


x(n  + 1/n  + 1)  = x(n  + 1/n)  + Pn  + 1H  / Z(n  + 1)  - Hx(n  + 1/n) 


n + 1 


n + 1 


x(n  + 1/n)  = 4>x(n) 
with  approximate  covariance 

-1 


V+  1 = Pn\-  1/n  + ^ A Z(n  + D - Hx(n  + 1/n) 

8n2  sn  + 1 


(40) 

(41) 

(42) 


P , , . = $P  4>  + Q 

n + 1/n  n n 


(43) 


Where  <!»(•)  is  an  appropriate  influence  function.  Note  that  the  derivative 
of  is  required  for  the  update  of  the  filter  covariance  matrix. 


This  robust  filter  is  certainly  easy  to  implement  and  anyone  who  has 
done  much  recursive  filtering  of  data  on  a computer  has  probably  implemented 
such  a filter  with  the  following  function. 


i.g.,  we  process  observations  only  if  the  predicted  residuals  are  within 
+ka  where  a is  an  estimate  of  the  measurement  noise  standard  deviation. 
Thus,  robust  filtering  presents  nothing  new  as  far  as  filter  implementation 
is  concerned,  but  we  are  now  in  a position  to  possibly  improve  our  robust 
filtering  by  borrowing  some  ^ functions  and  other  concepts  which  have 
proved  very  useful  in  robust  regression. 
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